Construction of Thematic Representations of Texts Based on Domain-Specific Thesaurus

Authors

  • Natalia V. Loukachevitch
  • Boris V. Dobrov
Abstract

The paper considers the interrelations between lexical cohesion and the thematic structure of a text. A technique for automatically constructing the thematic representation of text contexts is described. The technique uses knowledge from the Sociopolitical thesaurus, which was developed specifically as a tool for automatic text processing.
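
As a rough illustration of how thesaurus relations can tie the terms of a text into thematic groups, the following minimal sketch links concepts that occur in a text whenever a thesaurus relates them directly, and ranks the resulting groups by total frequency. It is not the authors' algorithm: the `TOY_THESAURUS` dictionary, the `thematic_groups` function, and the frequency-based ranking are all assumptions made for this example, and the real Sociopolitical thesaurus is far richer.

```python
from collections import defaultdict

# Hypothetical toy thesaurus (an assumption for illustration only):
# each concept maps to the concepts it is directly related to.
TOY_THESAURUS = {
    "parliament": {"legislation", "deputy"},
    "legislation": {"parliament", "law"},
    "law": {"legislation"},
    "deputy": {"parliament"},
    "inflation": {"currency"},
    "currency": {"inflation"},
}


def thematic_groups(tokens, thesaurus):
    """Group thesaurus concepts found in `tokens` into connected components.

    Two concepts land in the same group when a chain of direct thesaurus
    relations connects them through concepts that also occur in the text.
    Returns (total_frequency, sorted_members) pairs, heaviest group first.
    """
    # Count only tokens that are known thesaurus concepts.
    freq = defaultdict(int)
    for tok in tokens:
        if tok in thesaurus:
            freq[tok] += 1

    seen = set()
    groups = []
    for start in freq:
        if start in seen:
            continue
        # Depth-first traversal over relations restricted to text concepts.
        stack = [start]
        seen.add(start)
        members = []
        while stack:
            concept = stack.pop()
            members.append(concept)
            for neighbour in thesaurus[concept]:
                if neighbour in freq and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        groups.append((sum(freq[m] for m in members), sorted(members)))

    # Heaviest group first; ties broken alphabetically for determinism.
    groups.sort(key=lambda g: (-g[0], g[1]))
    return groups


text = "the parliament passed a law on legislation while inflation hit the currency and parliament"
for weight, members in thematic_groups(text.split(), TOY_THESAURUS):
    print(weight, members)
```

In this sketch the heaviest group plays the role of a candidate main theme, while lighter groups stand in for secondary themes. A real system would also need term normalization, multiword expressions, and disambiguation, none of which are modeled here.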

Similar resources

Conceptual Business Process Structuring by Extracting Knowledge from Natural Language Texts

This article discusses methods of constructing a formalized structure of a subject domain based on analysis of natural language texts, including discovering objects, their properties and related actions, followed by discovering business processes specific to the subject domain and the formation of thesaurus and business processes of the subject domain. At the same time the thesaurus can be chan...

How to Thematically Segment Texts by Using Lexical Cohesion?

This article outlines a quantitative method for segmenting texts into thematically coherent units. This method relies on a network of lexical collocations to compute the thematic coherence of the different parts of a text from the lexical cohesiveness of their words. We also present the results of an experiment on locating boundaries between a series of concatenated texts. ...

Ontologies, Taxonomies, Thesauri: Learning from Texts

The use of ontologies as representations of knowledge is widespread but their construction, until recently, has been entirely manual. We argue in this paper for the use of text corpora and automated natural language processing methods for the construction of ontologies. We delineate the challenges and present criteria for the selection of appropriate methods. We distinguish three major steps in...

Discovering and visualizing narrative themes

This paper presents a framework for indexing and browsing databases of stories, in particular characterizing and visually exploring each narrative’s thematic content. We introduce a method for discovering thematic content in texts via lexical dissimilarity statistics. A maximum-likelihood algorithm clusters words into pools of similar meaning, using a thesaurus for rough estimates of word sense ...

Automatic Ontology Extraction from Unstructured Texts

Construction of the ontology of a specific domain currently relies on the intuition of a knowledge engineer, and the typical output is a thesaurus of terms, each of which is expected to denote a concept. Ontological ‘engineers’ tend to hand-craft these thesauri on an ad hoc basis and on a relatively small scale. Workers in the specific domain create their own special language, and one device for...

Publication date: 2002